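Gravitational-wave astronomy is a vibrant field that leverages both classical and modern data-processing techniques to understand the universe. Various approaches have been proposed to improve the efficiency of detection schemes, with hierarchical matched filtering being an important strategy. Meanwhile, deep learning methods have recently demonstrated both consistency with matched filtering and remarkable statistical performance. In this work, we propose the Hierarchical Detection Network (HDN), a novel approach to efficient detection that combines ideas from hierarchical matching and deep learning. The network is trained with a novel loss function that simultaneously encodes the goals of statistical accuracy and efficiency. We discuss the source of the proposed model's complexity reduction and describe a general recipe for initialization in which each layer specializes in a different region. We demonstrate the performance of HDN with experiments using open LIGO data and synthetic injections, and with two-layer models we observe a $79\%$ efficiency gain compared with matched filtering at an equal error rate of $0.2\%$. Furthermore, we show how training a three-layer HDN initialized from a two-layer model can further boost both accuracy and efficiency, highlighting the power of multiple simple layers in efficient detection.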
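The hierarchical matched-filtering strategy mentioned in the abstract above can be illustrated with a minimal two-stage sketch. This is not the HDN architecture or its loss function. It only shows the classical idea of screening data with a small, cheap template bank and running the full bank on the segments that pass. The function names, the windowed-sine templates, and the thresholds are all hypothetical choices made for illustration.

```python
import numpy as np


def make_sine_bank(frequencies, length=64):
    """Build unit-norm, Hann-windowed sine templates (illustrative stand-ins
    for real waveform templates). Frequencies are in cycles per sample."""
    n = np.arange(length)
    window = np.hanning(length)
    bank = [window * np.sin(2 * np.pi * f * n) for f in frequencies]
    return [t / np.linalg.norm(t) for t in bank]


def peak_filter_output(segment, template):
    """Largest absolute correlation of a unit-norm template with the segment,
    taken over all time shifts where the template fits entirely."""
    corr = np.correlate(segment, template, mode="valid")
    return float(np.max(np.abs(corr)))


def two_stage_detect(segments, coarse_bank, fine_bank,
                     coarse_threshold, fine_threshold):
    """Screen each segment with the coarse bank; run the fine bank only on
    segments whose coarse statistic reaches the (lower) coarse threshold.

    Returns the per-segment decisions and the number of template
    correlations performed, as a crude proxy for computational cost.
    """
    detections = []
    evaluations = 0
    for seg in segments:
        evaluations += len(coarse_bank)
        coarse_stat = max(peak_filter_output(seg, t) for t in coarse_bank)
        if coarse_stat < coarse_threshold:
            # Rejected cheaply: the fine bank is never evaluated.
            detections.append(False)
            continue
        evaluations += len(fine_bank)
        fine_stat = max(peak_filter_output(seg, t) for t in fine_bank)
        detections.append(fine_stat >= fine_threshold)
    return detections, evaluations
```

In this sketch the saving comes entirely from segments rejected at the first stage, so the coarse threshold trades missed signals against cost. With a 4-template coarse bank and a 16-template fine bank, a segment rejected at the first stage costs 4 correlations instead of 16.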
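As engineered systems grow in complexity, there is an increasing need for automatic methods that can detect, diagnose, and even correct the transient anomalies that inevitably arise and that can be difficult or impossible to diagnose and fix manually. Among the most sensitive and complex systems of our civilization are the detectors that search for incredibly small variations in distance caused by gravitational waves, phenomena originally predicted by Albert Einstein to emerge and propagate through the universe as a result of collisions between black holes and other massive objects in deep space. The extreme complexity and precision of such detectors make them subject to transient noise issues that can significantly limit their sensitivity and effectiveness. In this work, we present a demonstration of a method that can detect and characterize emergent transient anomalies in such massively complex systems. We illustrate the performance, precision, and adaptability of the automated solution on one of the prevalent issues limiting gravitational-wave discoveries: noise artifacts of terrestrial origin that contaminate the highly sensitive measurements of gravitational-wave observatories and can obscure or even mimic the faint astrophysical signals they are listening for. Specifically, we demonstrate how a highly interpretable convolutional classifier can automatically learn to detect transient anomalies from auxiliary detector data without needing to observe the anomalies themselves. We also illustrate several other useful features of the model, including how it performs automatic variable selection to reduce tens of thousands of auxiliary data channels to only a few relevant ones; how it identifies behavioral signatures of anomalies in those channels; and how it can be used to investigate individual anomalies and the channels associated with them.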
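As our ability to sense increases, we are experiencing a transition from data-poor problems, in which the central issue is a lack of relevant data, to data-rich problems, in which the central issue is identifying a few relevant features in a sea of observations. Motivated by applications in gravitational-wave astrophysics, we study the problem of predicting the presence of transient noise artifacts in a gravitational-wave detector from a rich collection of measurements of the detector and its environment. We argue that feature learning, in which relevant features are optimized from data, is critical to achieving high accuracy. We introduce models that reduce the error rate by over 60% compared with the previous state of the art, which used fixed, hand-crafted features. Feature learning is useful not only because it improves performance on prediction tasks; its results also provide valuable information about patterns associated with phenomena of interest that would otherwise be undiscoverable. In our application, the features found to be associated with transient noise provide diagnostic information about its origin and suggest mitigation strategies. Learning in high-dimensional settings is challenging. Through experiments with a variety of architectures, we identify two key factors in successful models: sparsity, for selecting relevant variables within the high-dimensional observations; and depth, which confers the flexibility to handle complex interactions and robustness with respect to temporal variations. We illustrate their significance through systematic experiments on real detector data. Our results provide experimental corroboration of commonly held assumptions in the machine learning community and are directly applicable to improving our ability to sense gravitational waves, as well as to many other problems with similarly high-dimensional, noisy, or partly irrelevant data.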
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even if the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which hinders graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built from the largest original dataset in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
Learning feature interactions is key to success in large-scale CTR prediction and recommendation. In practice, handcrafted feature engineering usually requires exhaustive searching. To reduce the high cost of human effort in feature engineering, researchers have proposed several deep neural network (DNN)-based approaches that learn feature interactions in an end-to-end fashion. However, existing methods either do not learn vector-wise and bit-wise interactions simultaneously, or fail to combine them in a controllable manner. In this paper, we propose a new model, xDeepInt, based on a novel network architecture called the polynomial interaction network (PIN), which learns higher-order vector-wise interactions recursively. By integrating a subspace-crossing mechanism, we enable xDeepInt to balance the mixture of vector-wise and bit-wise feature interactions at a bounded order. Based on the network architecture, we customize a combined optimization strategy to conduct feature selection and interaction selection. We implement the proposed model and evaluate its performance on three real-world datasets. Our experimental results demonstrate the efficacy and effectiveness of xDeepInt over state-of-the-art models. We open-source the TensorFlow implementation of xDeepInt: https://github.com/yanyachen/xDeepInt.
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.
Witnessing the impressive achievements of pre-training techniques on large-scale data in the fields of computer vision and natural language processing, we wonder whether this idea could be adapted in a grab-and-go spirit to mitigate the sample inefficiency problem in visuomotor driving. Given the highly dynamic and variant nature of the input, the visuomotor driving task inherently lacks view and translation invariance, and the visual input contains a large amount of information irrelevant to decision making, which makes predominant pre-training approaches from general vision less suitable for the autonomous driving task. To this end, we propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for policy pre-training in visuomotor driving. We aim to learn policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos. PPGeo is performed in two stages to support effective self-supervised training. In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input. In the second stage, the visual encoder learns a driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on the current visual observation only. As such, the pre-trained visual encoder is equipped with rich driving-policy-related representations and is thereby competent for multiple visuomotor driving tasks. Extensive experiments covering a wide span of challenging scenarios demonstrate the superiority of our proposed approach, with improvements ranging from 2% to even over 100% with very limited data. Code and models will be available at https://github.com/OpenDriveLab/PPGeo.
The interview is regarded as one of the most crucial steps in recruitment. To fully prepare for interviews with recruiters, job seekers usually practice with mock interviews among themselves. However, such a mock interview with peers is generally far from the real interview experience: the mock interviewers are not guaranteed to be professional and are not likely to behave like a real interviewer. Due to the rapid growth of online recruitment in recent years, recruiters tend to conduct interviews online, which makes it possible to collect real interview data from real interviewers. In this paper, we propose a novel application named EZInterviewer, which aims to learn from online interview data and provide mock interview services to job seekers. The task is challenging in two ways: (1) interview data are now available but still low-resource; (2) generating meaningful and relevant interview dialogs requires a thorough understanding of both resumes and job descriptions. To address the low-resource challenge, EZInterviewer is trained on a very small set of interview dialogs. The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and the dialog generator, so that most parameters can be trained with ungrounded dialogs and resume data, which are not low-resource. Evaluation results on a real-world job interview dialog dataset indicate that we achieve promising results in generating mock interviews. With the help of EZInterviewer, we hope to make mock interview practice easier for job seekers.
When using LiDAR semantic segmentation models for safety-critical applications such as autonomous driving, it is essential to understand and improve their robustness with respect to a wide range of LiDAR corruptions. In this paper, we aim to comprehensively analyze the robustness of LiDAR semantic segmentation models under various corruptions. To rigorously evaluate the robustness and generalizability of current approaches, we propose a new benchmark called SemanticKITTI-C, which features 16 out-of-domain LiDAR corruptions in three groups, namely adverse weather, measurement noise, and cross-device discrepancy. Then, we systematically investigate 11 LiDAR semantic segmentation models, spanning different input representations (e.g., point clouds, voxels, projected images), network architectures, and training schemes. Through this study, we obtain two insights: 1) The input representation plays a crucial role in robustness. Specifically, under specific corruptions, different representations perform differently. 2) Although state-of-the-art methods for LiDAR semantic segmentation achieve promising results on clean data, they are less robust when dealing with noisy data. Finally, based on the above observations, we design a robust LiDAR segmentation model (RLSeg) that greatly boosts robustness with simple but effective modifications. We hope that our benchmark, comprehensive analysis, and observations can boost future research in robust LiDAR semantic segmentation for safety-critical applications.